 binocular vision


CNN^{2}: Viewpoint Generalization via a Binocular Vision

Neural Information Processing Systems

Convolutional Neural Networks (CNNs) have laid the foundation for many techniques in various applications. Despite achieving remarkable performance in some tasks, the 3D viewpoint generalizability of CNNs is still far behind human visual capabilities. Although recent efforts, such as Capsule Networks, have been made to address this issue, these new models are hard to train and/or incompatible with existing CNN-based techniques specialized for different applications. Observing that humans use binocular vision to understand the world, we study in this paper whether the 3D viewpoint generalizability of CNNs can be achieved via binocular vision. We propose CNN^{2}, a CNN that takes two images as input, which resembles the process of an object being viewed from the left eye and the right eye. CNN^{2} uses novel augmentation, pooling, and convolutional layers to learn a sense of three-dimensionality in a recursive manner. Empirical evaluation shows that CNN^{2} has improved viewpoint generalizability compared to vanilla CNNs. Furthermore, CNN^{2} is easy to implement and train, and is compatible with existing CNN-based specialized techniques for different applications.


Why do horses have eyes on the side of their head?

Popular Science

'You often have to teach horses something on both sides of their body for them to process the information fully.' In the animal kingdom, horses are prey. Have you ever noticed that horses have eyes on the sides of their heads rather than the front, like humans do? The location of horses' eyes offers a biological advantage that helps keep them safe as prey animals.


Reviews: CNN^{2}: Viewpoint Generalization via a Binocular Vision

Neural Information Processing Systems

Originality: To my knowledge, the motivation for such a dual-pathway design is not new. But the particular design of this paper, CM pooling in particular, is definitely novel. Quality: I think the evaluation of this work is quite thorough, but it is missing some important items. 1. It seems that using CM pooling in vanilla CNNs is not shown in the paper. This makes it less clear whether this pooling actually improves the performance of vanilla CNNs. 2. Missing vanilla CNN tuning details.


Low-cost Stereovision system (disparity map) for few dollars

arXiv.org Artificial Intelligence

The paper presents an analysis of the latest developments in low-cost stereo vision, both for prototypes and for industrial designs. We describe the theory of stereo vision and present information about cameras and data transfer protocols and their compatibility with various devices. The theory of image processing for stereo vision is considered, and the calibration process is described in detail. We then present the developed stereo vision system and the main points that need to be considered when developing such systems. Finally, we present software, written in Python for the Windows operating system, for adjusting stereo vision parameters in real time.
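
As a rough illustration of what a disparity map is, the sketch below estimates disparity with a basic sum-of-absolute-differences block matcher on a rectified grayscale pair. It is not the system or software described in the abstract; the function names `disparity_map` and `depth_from_disparity`, the parameters `max_disparity` and `half_window`, and the list-of-rows image format are all assumptions made for this example.

```python
def disparity_map(left, right, max_disparity=4, half_window=1):
    """Estimate per-pixel disparity by SAD block matching (illustrative only).

    left, right: rectified grayscale images as equal-sized lists of rows.
    Returns a list of rows of integer disparities; border pixels stay 0.
    """
    height, width = len(left), len(left[0])
    disparity = [[0] * width for _ in range(height)]
    for y in range(half_window, height - half_window):
        for x in range(half_window, width - half_window):
            best_cost, best_d = None, 0
            # A point at column x in the left image is assumed to appear
            # at column x - d in the right image.
            for d in range(max_disparity + 1):
                if x - d - half_window < 0:
                    break  # the shifted window would leave the image
                cost = 0
                for dy in range(-half_window, half_window + 1):
                    for dx in range(-half_window, half_window + 1):
                        cost += abs(left[y + dy][x + dx]
                                    - right[y + dy][x - d + dx])
                # Keep the smallest disparity among equally good matches.
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disparity[y][x] = best_d
    return disparity


def depth_from_disparity(d, focal_px, baseline_m):
    """Pinhole stereo relation Z = f * B / d; None when disparity is zero."""
    return focal_px * baseline_m / d if d > 0 else None
```

Calling `disparity_map(left, right)` on a calibrated, rectified pair yields the disparity image, and `depth_from_disparity` converts a single disparity value to metric depth given a focal length in pixels and a baseline in metres. Practical systems typically rely on optimized matchers rather than a nested-loop search like this one.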


Self-Calibrating Active Binocular Vision via Active Efficient Coding with Deep Autoencoders

arXiv.org Artificial Intelligence

We present a model of the self-calibration of active binocular vision comprising the simultaneous learning of visual representations, vergence, and pursuit eye movements. The model follows the principle of Active Efficient Coding (AEC), a recent extension of the classic Efficient Coding Hypothesis to active perception. In contrast to previous AEC models, the present model uses deep autoencoders to learn sensory representations. We also propose a new formulation of the intrinsic motivation signal that guides the learning of behavior. We demonstrate the performance of the model in simulations.

